AITopics | tensor processing unit

tensor processing unit

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question "What is Artificial Intelligence?" and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

25 Tech Insiders on the Innovations Defining American Life Today

TIME - Tech | May-5-2026, 13:00:17 GMT

Apple's iPhone is unveiled at a press conference in central London in September 2007. When iPhone arrived in 2007, it didn't just change technology. It changed how we live. That same year, TIME named it the Invention of the Year and called it "the phone that forever changed phones." But what mattered most wasn't just what iPhone was, but what it made possible.

artificial intelligence, press release, smartphone, (14 more...)

TIME - Tech

Country: North America > United States > California (0.47)

Genre: Press Release (0.34)

Industry:

Energy > Renewable > Solar (0.69)
Health & Medicine (0.69)

Technology:

Information Technology > Artificial Intelligence (1.00)
Information Technology > Communications > Mobile (0.77)

Why Google's custom AI chips are shaking up the tech industry

New Scientist | Nov-28-2025, 16:00:11 GMT

Ironwood is Google's latest tensor processing unit. Nvidia's position as the dominant supplier of AI chips may be under threat from a specialised chip pioneered by Google, with reports suggesting companies like Meta and Anthropic are looking to spend billions on Google's tensor processing units. The success of the artificial intelligence industry has been in large part based on graphics processing units (GPUs), a kind of computer chip that can perform many calculations in parallel, rather than one after the other like the central processing units (CPUs) that power most computers. GPUs were originally developed to assist with computer graphics, as the name suggests, and gaming. "If I have a lot of pixels in a space and I need to do a rotation of this to calculate a new camera view, this is an operation that can be done in parallel, for many different pixels," says Francesco Conti at the University of Bologna in Italy. This ability to do calculations in parallel happened to be useful for training and running AI models, which often use calculations involving vast grids of numbers performed at the same time, called matrix multiplication.
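The rotation example in the summary above can be sketched as a single matrix multiplication. This is a minimal illustration, not anything taken from the article: the helper names `rotation_matrix` and `matmul` and the sample points are assumptions, and the plain-Python loops run sequentially. The point is that every output column depends only on its own input column, so a GPU or TPU is free to compute them all at once.

```python
import math

def rotation_matrix(theta):
    """2x2 matrix that rotates a point counter-clockwise by theta radians."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]

def matmul(a, b):
    """Multiply an (n x k) matrix by a (k x m) matrix, both given as nested lists."""
    n, k, m = len(a), len(b), len(b[0])
    return [[sum(a[i][p] * b[p][j] for p in range(k)) for j in range(m)]
            for i in range(n)]

# Hypothetical pixel coordinates, one pixel per column: (1, 0), (0, 1), (-1, 0).
points = [[1.0, 0.0, -1.0],
          [0.0, 1.0, 0.0]]

# Rotating every pixel by 90 degrees is one matrix product. Each column of
# the result is computed independently of the others, which is what makes
# the operation easy to parallelize.
rotated = matmul(rotation_matrix(math.pi / 2), points)
```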

artificial intelligence, machine learning, social media, (17 more...)

New Scientist

Country:

Europe > Italy > Emilia-Romagna > Metropolitan City of Bologna > Bologna (0.25)
Europe > United Kingdom > England > Bristol (0.05)
Africa (0.05)

Industry: Information Technology (1.00)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Machine Learning (0.72)

Flex-TPU: A Flexible TPU with Runtime Reconfigurable Dataflow Architecture

Elbtity, Mohammed, Chandarana, Peyton, Zand, Ramtin

arXiv.org Artificial Intelligence | Jul-11-2024

Tensor processing units (TPUs) are one of the most well-known machine learning (ML) accelerators utilized at large scale in data centers as well as in tiny ML applications. TPUs offer several improvements and advantages over conventional ML accelerators, like graphics processing units (GPUs), being designed specifically to perform the multiply-accumulate (MAC) operations required in the matrix-matrix and matrix-vector multiplies extensively present throughout the execution of deep neural networks (DNNs). Such improvements include maximizing data reuse and minimizing data transfer by leveraging the temporal dataflow paradigms provided by the systolic array architecture. While this design provides a significant performance benefit, the current implementations are restricted to a single dataflow consisting of either input, output, or weight stationary architectures. This can limit the achievable performance of DNN inference and reduce the utilization of compute units. Therefore, the work herein consists of developing a reconfigurable dataflow TPU, called the Flex-TPU, which can dynamically change the dataflow per layer during run-time. Our experiments thoroughly test the viability of the Flex-TPU, comparing it to conventional TPU designs across multiple well-known ML workloads. The results show that our Flex-TPU design achieves a significant performance increase of up to 2.75x compared to a conventional TPU, with only minor area and power overheads.
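To make the dataflow terminology in the abstract above more concrete, the sketch below computes the same matrix product with two different loop orders. It is a simplified software model and not the Flex-TPU design: the function names are assumptions, and real systolic arrays pipeline these multiply-accumulate (MAC) steps across a grid of processing elements (PEs) rather than running nested loops. What the loop order does capture is which operand stays put: in the output-stationary version each accumulator is held until its result is complete, while in the weight-stationary version each weight is fetched once and reused across all inputs.

```python
def matmul_output_stationary(a, w):
    """Each output accumulator stays in place while inputs and weights stream past it."""
    n, k, m = len(a), len(w), len(w[0])
    out = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            acc = 0.0                          # accumulator pinned to a notional PE (i, j)
            for p in range(k):
                acc += a[i][p] * w[p][j]       # one MAC operation
            out[i][j] = acc                    # written out once, when complete
    return out

def matmul_weight_stationary(a, w):
    """Each weight is fetched once and held while every input row streams through it."""
    n, k, m = len(a), len(w), len(w[0])
    out = [[0.0] * m for _ in range(n)]
    for p in range(k):
        for j in range(m):
            weight = w[p][j]                   # weight pinned to a notional PE (p, j)
            for i in range(n):
                out[i][j] += a[i][p] * weight  # partial sums are updated repeatedly
    return out

# Hypothetical 2x2 activations and weights; both dataflows give the same product.
activations = [[1, 2], [3, 4]]
weights = [[5, 6], [7, 8]]
result_os = matmul_output_stationary(activations, weights)
result_ws = matmul_weight_stationary(activations, weights)
```

Both functions perform the same number of MACs. They differ only in which data is reused locally, which is why the better choice can vary with the shape of each layer.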

architecture, dataflow, systolic array, (16 more...)

arXiv.org Artificial Intelligence

2407.087

Country:

North America > United States > South Carolina > Richland County > Columbia (0.14)
North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.04)
North America > United States > New York > New York County > New York City (0.04)

Genre: Research Report > New Finding (0.48)

Industry: Information Technology > Services (0.49)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.34)

HiP Attention: Sparse Sub-Quadratic Attention with Hierarchical Attention Pruning

Lee, Heejun, Park, Geon, Lee, Youngwan, Kim, Jina, Jeong, Wonyoung, Jeon, Myeongjae, Hwang, Sung Ju

arXiv.org Artificial Intelligence | Jun-14-2024

In modern large language models (LLMs), increasing sequence lengths is a crucial challenge for enhancing their comprehension and coherence in handling complex tasks such as multi-modal question answering. However, handling long context sequences with LLMs is prohibitively costly due to the conventional attention mechanism's quadratic time and space complexity, and the context window size is limited by the GPU memory. Although recent works have proposed linear and sparse attention mechanisms to address this issue, their real-world applicability is often limited by the need to re-train pre-trained models. In response, we propose a novel approach, Hierarchically Pruned Attention (HiP), which simultaneously reduces the training and inference time complexity from $O(T^2)$ to $O(T \log T)$ and the space complexity from $O(T^2)$ to $O(T)$. To this end, we devise a dynamic sparse attention mechanism that generates an attention mask through a novel tree-search-like algorithm for a given query on the fly. HiP is training-free as it only utilizes the pre-trained attention scores to spot the positions of the top-$k$ most significant elements for each query. Moreover, it ensures that no token is overlooked, unlike the sliding window-based sub-quadratic attention methods, such as StreamingLLM. Extensive experiments on diverse real-world benchmarks demonstrate that HiP significantly reduces prompt (i.e., prefill) and decoding latency and memory usage while maintaining high generation performance with little or no degradation. As HiP allows pretrained LLMs to scale to millions of tokens on commodity GPUs with no additional engineering due to its easy plug-and-play deployment, we believe that our work will have a large practical impact, opening up the possibility to many long-context LLM applications previously infeasible.
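The idea of keeping only the top-$k$ attention scores per query can be sketched as follows. This is deliberately not the HiP algorithm: it scores every key by brute force, which costs $O(T)$ per query, whereas the abstract describes a tree-search-like procedure that finds the important positions without doing so. The function names and the toy inputs are assumptions made for illustration. The sketch only shows what a top-$k$ attention mask does once the positions are known.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def topk_sparse_attention(query, keys, values, k):
    """Attend from one query to only the k highest-scoring keys.

    Returns the attention output and the sorted indices that were kept.
    """
    scale = 1.0 / math.sqrt(len(query))
    # Brute-force scaled dot-product scores (the step HiP is designed to avoid).
    scores = [scale * sum(q * c for q, c in zip(query, key)) for key in keys]
    # Keep the k positions with the largest scores; all others are masked out.
    keep = sorted(range(len(keys)), key=lambda t: scores[t], reverse=True)[:k]
    weights = softmax([scores[t] for t in keep])
    dim = len(values[0])
    out = [sum(w * values[t][d] for w, t in zip(weights, keep)) for d in range(dim)]
    return out, sorted(keep)

# Hypothetical toy inputs: four tokens with two-dimensional keys and values.
query = [1.0, 0.0]
keys = [[10.0, 0.0], [0.0, 10.0], [9.0, 0.0], [-5.0, 0.0]]
values = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [5.0, 5.0]]
output, kept = topk_sparse_attention(query, keys, values, k=2)
```

Setting `k` to the full sequence length recovers ordinary dense attention for that query, so the sketch can be checked against a dense baseline.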

complexity, latency, streamingllm, (14 more...)

arXiv.org Artificial Intelligence

2406.09827

Country:

North America > United States (0.46)
Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)

Genre: Research Report > Promising Solution (0.65)

Industry:

Government > Regional Government (0.93)
Government > Military (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.70)

Google's new AI supercomputer is 'a unique approach to AI development', claims expert

#artificialintelligence | Apr-7-2023, 14:20:11 GMT

Google recently announced it has developed a unique artificial intelligence (AI) supercomputer that is faster, more efficient, and more powerful than Nvidia systems. Nvidia is the reigning champion of AI model training and deployment, dominating over 90% of the market, according to CNBC. The great AI race has been raging on for a while now in Big Tech, and Google has been developing AI chips called Tensor Processing Units (TPUs) since 2016. "Google has chosen a unique approach to AI development by creating its own 'Tensor Processing Unit' (TPU) architecture, rather than relying on specialised GPUs [graphics processing units] from Nvidia," founder of Elo AI, Matt Falconer explains. "This decision allows Google to reduce their dependence on third-party vendors and achieve vertical integration across its entire AI stack," Falconer added.

ai supercomputer, google, supercomputer, (11 more...)

#artificialintelligence

Industry: Information Technology (0.92)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (0.78)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.35)

Coffee corner: are deep learning's returns diminishing?

AIHub | Nov-3-2021, 11:42:23 GMT

This month, we discuss an article that appeared recently in IEEE Spectrum entitled: Deep learning's diminishing returns. The article reports that deep-learning models are becoming more and more accurate, but the computing power needed to achieve this accuracy is increasing at such a rate that, to further reduce the error rates, the cost and environmental impact is going to be unsustainably high. Joining the discussion this time are: Tom Dietterich (Oregon State University), Stephen Hanson (Rutgers University), Sabine Hauert (University of Bristol), and Sarit Kraus (Bar-Ilan University). Sarit Kraus: I would like to start by considering the research aspect. Suppose a PhD student has a great idea about how to improve some machine learning algorithm. So now, they need to show that this improved algorithm is much better than all those before.

algorithm, deep learning, learning, (16 more...)

AIHub

Country: North America > United States > Oregon (0.24)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Hardware Acceleration of Explainable Machine Learning using Tensor Processing Units

#artificialintelligence | Mar-24-2021, 01:43:45 GMT

explainable machine learning, hardware acceleration, tensor processing unit, (3 more...)

#artificialintelligence

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Hardware Acceleration of Explainable Machine Learning using Tensor Processing Units

Pan, Zhixin, Mishra, Prabhat

arXiv.org Artificial Intelligence | Mar-22-2021

Machine learning (ML) is successful in achieving human-level performance in various fields. However, it lacks the ability to explain an outcome due to its black-box nature. While existing explainable ML is promising, almost all of these methods focus on formatting interpretability as an optimization problem. Such a mapping leads to numerous iterations of time-consuming complex computations, which limits their applicability in real-time applications. In this paper, we propose a novel framework for accelerating explainable ML using Tensor Processing Units (TPUs). The proposed framework exploits the synergy between matrix convolution and Fourier transform, and takes full advantage of TPU's natural ability in accelerating matrix computations. Specifically, this paper makes three important contributions. (1) To the best of our knowledge, our proposed work is the first attempt in enabling hardware acceleration of explainable ML using TPUs. (2) Our proposed approach is applicable across a wide variety of ML algorithms, and effective utilization of TPU-based acceleration can lead to real-time outcome interpretation. (3) Extensive experimental results demonstrate that our proposed approach can provide an order-of-magnitude speedup in both classification time (25x on average) and interpretation time (13x on average) compared to state-of-the-art techniques.
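The link between convolution and the Fourier transform that the abstract above relies on is the convolution theorem: a circular convolution in the original domain becomes an element-wise product in the frequency domain. The sketch below checks this on a small one-dimensional signal. It is an illustration under stated assumptions, not the paper's framework: the paper works with matrices on TPUs, while this uses a naive $O(n^2)$ discrete Fourier transform in plain Python, and all function names and sample values are hypothetical.

```python
import cmath

def dft(x, inverse=False):
    """Naive discrete Fourier transform (or its inverse) of a sequence."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        s = sum(x[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n) for t in range(n))
        out.append(s / n if inverse else s)
    return out

def circular_convolve_direct(a, b):
    """Circular convolution computed straight from the definition."""
    n = len(a)
    return [sum(a[t] * b[(i - t) % n] for t in range(n)) for i in range(n)]

def circular_convolve_fourier(a, b):
    """Same convolution via the convolution theorem: transform, multiply, invert."""
    fa, fb = dft(a), dft(b)
    product = [x * y for x, y in zip(fa, fb)]
    return [v.real for v in dft(product, inverse=True)]

# Hypothetical sample signal and kernel of equal length.
signal = [1, 2, 3, 4]
kernel = [1, 0, -1, 0]
direct = circular_convolve_direct(signal, kernel)
via_fourier = circular_convolve_fourier(signal, kernel)
```

The two results agree up to floating-point rounding. The transform itself can also be written as a matrix product, which is the kind of operation matrix accelerators are built for.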

computation, interpretation, matrix, (12 more...)

arXiv.org Artificial Intelligence

2103.11927

Country: North America > United States > Florida > Alachua County > Gainesville (0.14)

Genre:

Research Report > Promising Solution (0.48)
Research Report > New Finding (0.34)

Industry: Information Technology > Security & Privacy (0.71)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.47)
Information Technology > Data Science > Data Quality > Data Transformation (0.38)
Information Technology > Artificial Intelligence > Representation & Reasoning > Expert Systems (0.34)

Powerful Photon-Based Processing Units Enable Complex Artificial Intelligence

#artificialintelligence | Aug-3-2020, 09:00:08 GMT

The photonic tensor core performs vector-matrix multiplications by utilizing the efficient interaction of light at different wavelengths with multistate photonic phase change memories. Using photons to create more powerful and power-efficient processing units for more complex machine learning. Machine learning performed by neural networks is a popular approach to developing artificial intelligence, as researchers aim to replicate brain functionalities for a variety of applications. A paper in the journal Applied Physics Reviews, by AIP Publishing, proposes a new approach to perform computations required by a neural network, using light instead of electricity. In this approach, a photonic tensor core performs multiplications of matrices in parallel, improving speed and efficiency of current deep learning paradigms.

machine learning, powerful photon-based processing unit enable, unit enable complex artificial intelligence, (8 more...)

#artificialintelligence

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.86)

Google claims its new TPUs are 2.7 times faster than the previous generation

#artificialintelligence | Jul-30-2020, 17:01:21 GMT

Google's fourth-generation tensor processing units (TPUs), the existence of which wasn't publicly revealed until today, can complete AI and machine learning training workloads in close-to-record wall clock time. That's according to the latest set of metrics released by MLPerf, the consortium of over 70 companies and academic institutions behind the MLPerf suite for AI performance benchmarking. It shows clusters of fourth-gen TPUs surpassing the capabilities of third-generation TPUs -- and even those of Nvidia's recently released A100 -- on object detection, image classification, natural language processing, machine translation, and recommendation benchmarks. Google says its fourth-generation TPU offers more than double the matrix multiplication TFLOPs of a third-generation TPU, where a single TFLOP is equivalent to 1 trillion floating-point operations per second. It also offers a "significant" boost in memory bandwidth while benefiting from unspecified advances in interconnect technology.

artificial intelligence, machine learning, natural language, (16 more...)

#artificialintelligence

Industry: Information Technology > Services (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.36)
